Robotic Camera System

Max McCarthy [msm296]
Luke Forsman [ltf25]

Objective

The goal of this project was to use the high-definition camera module and lens to emulate an industry-standard DSLR camera with auto-focus capabilities. Features include image capture and storage, video monitoring via the TFT display, and automated focus with touchscreen subject identification.


Project Video

Introduction

The project can be broken down into three components: 1) implementation of the camera module to monitor camera input and capture images, 2) design of a hardware system that performs hands-free rotation of the lens focus ring given a control signal, and 3) creation of an algorithm for focusing the camera automatically.

The foundation of the camera system is built on the Raspberry Pi's picamera and pygame Python libraries. Frames received from the camera module are drawn to the TFT to monitor what is within the frame of the lens. A push-button gives the user the ability to capture the current frame; the image is then stored under a unique file name.

The hardware system has the sole purpose of mounting and rotating the lens focus ring via servo motor. A tripod supports the camera module, the lens, and the servo. A PWM signal controls the angle of the servo, which adjusts the lens.

The focusing algorithm uses the Laplacian to measure the sharpness of the image and gradually steps through different focal distances to empirically determine the sharpest setting. It also allows the user to focus on a specific part of the frame by tapping the touchscreen.

Design & Testing

Before we could implement the auto-focus, we had to decide what our system would look like. Designing the system consisted of acquiring a tripod on which we could mount the camera module and creating a makeshift cardboard attachment to secure the motor. The system went through a few iterations.

In our earlier attempts, we tried to rotate the lens by pulling a pin attached to its focus ring; these included the use of rubber bands, wires, and a cardboard belt. In each case, a servo motor with an attached wheel served as the driving force.

Eventually, we replaced the cardboard servo frame with a piece of laser-cut acrylic; this reduced unwanted movement of the servo significantly. We also replaced the pin-pulling method of rotation with a simpler and much more effective friction-based method. The motor and wheel are held close enough to the lens to turn it with friction; we wrapped a rubber band around the circumference of the wheel for extra traction.

While setting up the system, we experimented with the camera module to see how we could implement the basic camera features (i.e., image capture/storage and live video monitoring). After trying a few different methods, we settled on an approach using the Python picamera and io.BytesIO modules. We set up a standard pygame screen (as we did in previous labs) and used a BytesIO stream to store each frame in RGB format; we then blit the image onto the TFT.

from io import BytesIO

import pygame
from picamera import PiCamera

BLACK = (0, 0, 0)

pygame.init()
screen = pygame.display.set_mode((320, 240))

camera = PiCamera()
camera.resolution = (320, 240)

# Buffer for one raw frame: 3 bytes (R, G, B) per pixel
rgb = bytearray(camera.resolution[0] * camera.resolution[1] * 3)

# Offsets that center the frame on the screen
x = (screen.get_width() - camera.resolution[0]) // 2
y = (screen.get_height() - camera.resolution[1]) // 2

stop = False
while not stop:
    # Capture a frame from the fast video port into an in-memory stream
    my_stream = BytesIO()
    camera.capture(my_stream, use_video_port=True, format='rgb')
    my_stream.seek(0)
    my_stream.readinto(rgb)
    my_stream.close()

    # Wrap the raw RGB bytes in a pygame surface and draw it
    img = pygame.image.frombuffer(
        rgb[0:(camera.resolution[0] * camera.resolution[1] * 3)],
        camera.resolution, 'RGB')

    screen.fill(BLACK)
    screen.blit(img, (x, y))
    pygame.display.flip()
    # (event handling that sets stop, e.g. a quit button, omitted here)

We use a standard servo motor to rotate the lens to specific angles. This grants us greater precision than an odometric approach with a continuous-rotation servo (estimating position from rotation time). In our final code, we vary the duty cycle between 1.5% and 5.5%, with 3.5% at the center of the servo's range of motion. In our setup, we align this center with the center of the focus ring's range of motion.
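
In software, this amounts to a few lines of RPi.GPIO PWM. The following is a minimal sketch of the control, assuming a 50 Hz signal; the pin number and the set_focus_angle() helper are placeholders of ours, not necessarily the original wiring or code.

import RPi.GPIO as GPIO

SERVO_PIN = 13                  # placeholder pin; match the actual wiring

GPIO.setmode(GPIO.BCM)
GPIO.setup(SERVO_PIN, GPIO.OUT)

pwm = GPIO.PWM(SERVO_PIN, 50)   # 50 Hz software PWM
pwm.start(3.5)                  # 3.5% duty = center of the servo's range

def set_focus_angle(duty):
    # Clamp to the 1.5%-5.5% range so we never over-rotate the lens
    duty = max(1.5, min(5.5, duty))
    pwm.ChangeDutyCycle(duty)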

Once we built the physical system and implemented the basic camera features, we addressed the auto-focus. We repurposed our existing video monitoring code to evaluate the sharpness of the frame at any point in time: the OpenCV Laplacian function convolves the frame with a Laplacian kernel for edge detection, and the .var() method gives the variance of the result. The higher the variance, the sharper the edges in the image, so comparing these values across lens positions yields the best focus setting.

The process of focusing starts by measuring this value at the current position and at the positions one step to each side; the position with the highest value sets the direction of rotation. The servo then rotates the lens step by step until the focus value stops increasing. A drop in value indicates that the previous step was a local maximum, so we rotate the lens back one step to settle at that peak. The check_focus() listing below computes the per-frame score, and a sketch of the stepping loop follows it.

import cv2

def check_focus():
    # Grab the current frame into the shared RGB buffer
    my_stream = BytesIO()
    camera.capture(my_stream, use_video_port=True, format='rgb')
    my_stream.seek(0)
    my_stream.readinto(rgb)
    my_stream.close()
    image = pygame.image.frombuffer(
        rgb[0:(camera.resolution[0] * camera.resolution[1] * 3)],
        camera.resolution, 'RGB')
    pygame.image.save(image, 'focus_temp.png')

    # Keep the monitoring feed live while focusing
    screen.fill(BLACK)
    screen.blit(image, (x, y))
    pygame.display.flip()

    # Re-read the saved frame with OpenCV (imread returns BGR ordering)
    temp = cv2.imread('focus_temp.png', cv2.IMREAD_UNCHANGED)
    # Score only the center 50x50 window: rows 95:145, cols 135:185
    gray = cv2.cvtColor(temp[95:145, 135:185], cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: higher means sharper edges
    focus = cv2.Laplacian(gray, cv2.CV_64F).var()
    return focus
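
Built on check_focus(), the stepping logic described above might look like the sketch below. It assumes the hypothetical set_focus_angle() helper from the PWM sketch; the step size and settling delay are illustrative values, not our tuned parameters.

import time

STEP = 0.1                       # duty-cycle increment per step (illustrative)

def auto_focus(duty):
    # Sample the current position and one step to each side
    scores = {}
    for d in (duty - STEP, duty, duty + STEP):
        set_focus_angle(d)
        time.sleep(0.3)          # let the servo settle before sampling
        scores[d] = check_focus()
    # The side with the higher score sets the direction of rotation
    direction = STEP if scores[duty + STEP] > scores[duty - STEP] else -STEP

    best = scores[duty]
    while True:
        duty += direction
        set_focus_angle(duty)
        time.sleep(0.3)
        score = check_focus()
        if score <= best:
            # Sharpness dropped: the previous step was the local maximum
            duty -= direction
            set_focus_angle(duty)
            return duty
        best = score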

We realized through experimentation that the focus score accounts for the entire frame, leading to cases where the system actually focuses on the background. The user-defined subject focus addresses this issue: when the user taps the touchscreen, a 50x50-pixel area around the tap becomes the target of the focus algorithm, and a small square flashes on the TFT to indicate the subject area and signal that focusing has begun. One of the push-buttons on the TFT applies the auto-focus to the center 50x50 area of the frame.
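
Mapping a tap to the focus window amounts to converting screen coordinates into frame coordinates. The sketch below shows one way to do this with pygame mouse events (on the piTFT, taps typically arrive as mouse input); the variable names and clamping are ours for illustration.

for event in pygame.event.get():
    if event.type == pygame.MOUSEBUTTONDOWN:
        tx, ty = event.pos
        # Convert screen coordinates into frame coordinates
        col, row = tx - x, ty - y
        # Center a 50x50 window on the tap, clamped inside the frame
        r0 = max(0, min(camera.resolution[1] - 50, row - 25))
        c0 = max(0, min(camera.resolution[0] - 50, col - 25))
        focus_window = (r0, r0 + 50, c0, c0 + 50)

With a window selected, check_focus() scores temp[r0:r0+50, c0:c0+50] instead of the fixed center crop. The images below demonstrate the contrasting focus: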

[Images demonstrating the contrasting focus]

Much of our testing involved trying to achieve different focal distances, to verify that the system focused properly across a wide range of cases. We found that most issues occurred at the bounds of rotation. At one point we repositioned the lens 90 degrees from its original alignment to tighten it to the camera module more securely, but this misaligned the motor and lens boundaries. To find the new boundaries, we used a small program that rotates the motor stepwise with two directional buttons. Once the new bounds were identified, the system worked better than it had originally.
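
That calibration program can be as simple as the sketch below. The button and servo pins are placeholders of ours; each press nudges the duty cycle by a small step and prints the value so the bounds can be recorded.

import time
import RPi.GPIO as GPIO

SERVO_PIN = 13             # placeholder pin
BTN_CCW, BTN_CW = 17, 22   # placeholder piTFT button pins

GPIO.setmode(GPIO.BCM)
GPIO.setup(SERVO_PIN, GPIO.OUT)
GPIO.setup(BTN_CCW, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(BTN_CW, GPIO.IN, pull_up_down=GPIO.PUD_UP)

pwm = GPIO.PWM(SERVO_PIN, 50)
duty = 3.5
pwm.start(duty)

while True:
    # Buttons are active-low: a press reads 0
    if not GPIO.input(BTN_CCW):
        duty = max(1.5, duty - 0.1)
    if not GPIO.input(BTN_CW):
        duty = min(5.5, duty + 0.1)
    pwm.ChangeDutyCycle(duty)
    print('duty: %.1f%%' % duty)   # note the value at each bound
    time.sleep(0.2)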

Results & Conclusions

We created a functional camera system with all of the features outlined in our objective. It can capture and store images, display a monitoring feed on the TFT, and automatically focus on specific portions of the frame as dictated by user input.

The focus system is fairly reliable, but could be improved. We believe most of the unreliability is due to the jitter of the servo. This extra movement makes it more difficult to precisely identify a target position.

Our original plan to rotate the lens involved the use of a string or cardboard belt to pull an attached pin. These attempts failed miserably, but the revised plan involving frictional rotation with a rubber band was quite effective.

Future Work

One of the first things to address would be the PWM control signal. Our software PWM implementation is highly unstable and makes controlling the hardware system less reliable. A hardware PWM signal would greatly reduce servo jitter and improve the consistency of the auto-focus.
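
One possible route is the pigpio library, which generates hardware-timed servo pulses via DMA. The following is an untested sketch of what the swap might look like; the pin is again a placeholder.

import pigpio

SERVO_PIN = 13               # placeholder pin

pi = pigpio.pi()             # requires the pigpiod daemon to be running

def set_focus_pulse(us):
    # Hardware-timed pulse width in microseconds (typical range 500-2500)
    us = max(500, min(2500, us))
    pi.set_servo_pulsewidth(SERVO_PIN, us)

set_focus_pulse(1500)        # center position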

Less important, but still worth noting, is the possibility of using a continuous-rotation servo instead of a standard one. Our original method of focusing required that we limit our angular range to avoid damaging the lens. Since the current method is purely friction-based, we are able to rotate the wheel beyond the range of the lens focus ring without risk. The primary incentive for switching is that continuous rotation suits our focus algorithm much better.

The other feature we would have liked to implement is a menu that lets the user browse captured images directly on the TFT without having to end the process. This feature comes standard with any decent camera, so it would be a great addition to our system.
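
Such a menu could start from something as simple as the sketch below, which loads saved captures and displays one at a time; the capture_*.jpg naming pattern is hypothetical, and the two directional buttons would step the index up and down.

import glob
import pygame

# Filename pattern is hypothetical; match the actual capture-naming scheme
captures = sorted(glob.glob('capture_*.jpg'))

def show_capture(screen, index):
    # Scale the saved image to the TFT and display it
    img = pygame.image.load(captures[index % len(captures)])
    img = pygame.transform.scale(img, screen.get_size())
    screen.blit(img, (0, 0))
    pygame.display.flip()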

Budget

TOTAL: $100.00

Team

[Team photo]

Max McCarthy (Left)

Hardware Design


Luke Forsman (Right)

Software Design